Programmable Contextual Analysis
Abstract
We describe a method for rapid prototyping of contextual analysis algorithms within an experimental page reader. Because such algorithms vary widely and depend on details of the page reader’s internal data structures, each new application currently requires custom low-level programming. This is undesirable because it impedes experimentation and restricts cost-effective applications to high-volume problems. To make contextual analysis more easily retargetable, we have designed a high-level language with primitives for traversing the document hierarchy and for generating, scoring, sorting, and pruning interpretations. It is based on Ousterhout’s interpreted language Tcl, which provides constructs such as variables, conditionals, and loops, and is easily extended by adding functions. Some functions are table-driven or built in: for example, character typing, typographical morphology analysis, and regular expression matching. Other functions are normally implemented as separately executing UNIX processes that communicate with the page reader via pipes. These may be pre-existing software tools imported from other research fields such as computational linguistics, information retrieval, and string matching. We illustrate the expressive power of the language in applications to English text using a spell-checker, Japanese text using character n-grams, and mixed Russian-English text using two lexicons with automatic context switching.
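The generate, score, sort, and prune steps named in the abstract can be illustrated with a minimal Python sketch. This is not the authors' Tcl-based language: the function names, the per-character candidate lists, the confidence values, and the lexicon bonus are all hypothetical, chosen only to show how the four primitives might compose.

```python
from itertools import product

def generate(candidates):
    """Expand per-character candidate lists into whole-word interpretations.

    `candidates` is a list (one entry per character position) of
    (character, confidence) pairs; the format is an assumption.
    """
    interpretations = []
    for combo in product(*candidates):
        text = "".join(ch for ch, _ in combo)
        score = 1.0
        for _, conf in combo:
            score *= conf  # treat positions as independent
        interpretations.append((text, score))
    return interpretations

def score_with_lexicon(interpretations, lexicon, bonus=2.0):
    """Boost interpretations found in the lexicon (hypothetical scoring rule)."""
    return [(t, s * bonus if t.lower() in lexicon else s)
            for t, s in interpretations]

def sort_and_prune(interpretations, keep=3):
    """Sort by descending score and keep only the best `keep` entries."""
    return sorted(interpretations, key=lambda p: p[1], reverse=True)[:keep]

# Hypothetical classifier output for a three-character word.
candidates = [
    [("c", 0.7), ("e", 0.3)],
    [("a", 0.6), ("o", 0.4)],
    [("t", 0.9), ("l", 0.1)],
]
lexicon = {"cat", "cot", "eat"}

best = sort_and_prune(score_with_lexicon(generate(candidates), lexicon))
print([text for text, _ in best])
```

In this toy setting the lexicon plays the role of the spell-checker mentioned above. Replacing `score_with_lexicon` with a character n-gram scorer, or with a call to an external process over a pipe, would leave the generate, sort, and prune steps unchanged.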
Related resources
Implementation of Face Recognition Algorithm on Fields Programmable Gate Array Card
The evolution of today's application technologies requires a certain level of robustness, reliability and ease of integration. We choose the Fields Programmable Gate Array (FPGA) hardware description language to implement the facial recognition algorithm based on "Eigen faces" using Principal Component Analysis. In this paper, we first present an overview of the PCA used for facial recognition,...
A rule-based evaluation of ladder logic diagram and timed petri nets for programmable logic controllers
This paper describes a case-study evaluation of a rule-based approach proposed for ladder logic diagrams and Petri nets. Initially, programmable logic controllers were widely designed using ladder logic diagrams. As the complexity and functionality of manufacturing systems increase, developing their software becomes more difficult. Thus, Petri nets as a high l...
Learning Privacy Expectations by Crowdsourcing Contextual Informational Norms
Designing programmable privacy logic frameworks that correspond to social, ethical, and legal norms has been a fundamentally hard problem. Contextual integrity (CI) (Nissenbaum 2010) offers a model for conceptualizing privacy that is able to bridge technical design with ethical, legal, and policy approaches. While CI is capable of capturing the various components of contextual privacy in theory...
Design and Implementation of Field Programmable Gate Array Based Baseband Processor for Passive Radio Frequency Identification Tag (TECHNICAL NOTE)
In this paper, an Ultra High Frequency (UHF) base band processor for a passive tag is presented. It proposes a Radio Frequency Identification (RFID) tag digital base band architecture which is compatible with the EPC C C2/ISO18000-6B protocol. Several design approaches such as clock gating technique, clock strobe design and clock management are used. In order to reduce the area Decimal Matrix C...
An Efficient Algorithm for Output Coding in Pal Based Cplds (TECHNICAL NOTE)
One of the approaches used to partition inputs consists in modifying and limiting the input set using an external transcoder. This method is strictly related to output coding. This paper presents an optimal output coding in PAL-based programmable transcoders. The algorithm can be used to implement circuits in PAL-based CPLDs.